Abstract

The distribution of inequivalent geometries occurring during self-assembly of the major capsid 
protein in thermodynamic equilibrium is determined based on a master equation approach. These 
results are implemented to characterize the assembly of SV40 virus and to obtain information 
on the putative pathways controlling the progressive build-up of the SV40 capsid. The experimental
testability of the predictions is assessed and an analysis of the geometries of the assembly
intermediates on the dominant pathways is used to identify targets for antiviral drug design. 



1 Introduction 

Manipulating the assembly of viral capsids is one way of interfering with the viral replication cycle 
and hence a possible avenue for anti-viral drug design. Despite its importance, the theory of viral
capsid assembly is still in its infancy. A first model for the self-assembly of a small plant virus was
pioneered by Zlotnick [25], exploring the assembly of a dodecahedral shape by a cascade of single order
reactions. It has since been extended to more involved scenarios [5, 26, 27], including a study of the
energy landscape underlying assembly [7], that is similar to approaches in protein folding [2] or the 
energy landscape description of association reactions [21 1221 12*5| . These results have been used to 
investigate the possibility of inhibiting assembly via an anti-viral drug in the case of Herpes Virus 
[2Hj . Related approaches include molecular dynamics studies of viral capsid assembly |13l H4] , and a 
molecular dynamics-like formalism that is implemented in connection with a "local rules" mechanism 
that regulates capsid assembly ^ ^] . 

A characteristic feature of these models is the fact that the bonding structures of all building 
blocks are treated on an equal footing. While this is justified for a large number of viruses, it is 
an inappropriate simplification for important families of virus such as the Papovaviridae, which are 
linked to cancer and are hence of particular interest for the public health sector. For example, the 
(pseudo-) T=7 capsids in this family are known to be composed of two different types of pentameric 
building blocks which are distinguished by their local bonding structure. This requires a mathematical 
representation of these building blocks that takes the differences in the local bonding environments 
into account. The tiling approach for the description of viral capsids [19, 20] provides an appropriate
mathematical framework for this. It encodes the locations of proteins and inter-subunit bonds in terms
of tilings, i.e. tessellations that represent the surface structure of the capsids. Since the vertex atlas of 
these tilings, i.e. the collection of all distinct local configurations around vertices in the tiling, encodes 
all different types of capsomeres and their bonding structures, it provides appropriate building blocks 
for the construction of assembly models. An assembly model for (pseudo-) T=7 capsids in the family
of Papovaviridae has been introduced along these lines in [10]. In this reference, a tree structure, the
assembly tree, has been determined that encodes all energetically preferred pathways of assembly,
and it is specified how the tree structure changes depending on the association constants. Moreover,
the concentrations of the statistically dominant assembly intermediates (i.e. inequivalent shapes at
various stages of capsid construction that are located on all pathways) have been computed.

For applications to anti-viral drug design, it is important to control not only the statistically 
dominant assembly intermediates, but also the concentrations of all other assembly intermediates, 
and, based on this, to determine the most probable assembly pathway(s). This issue is addressed in
this paper, where we adopt a master equation approach for the computation of the concentrations of
all assembly intermediates in the assembly tree and use this information to determine the putative
pathways controlling the progressive build-up of the capsid. In particular, in section 2 we introduce
the master equation approach as a tool for the computation of the concentrations of the assembly
intermediates for arbitrary viruses from a thermodynamical point of view. In section 3 we apply
this formalism to SV40 virus and determine the equilibrium concentrations of the various assembly 
intermediates. We discuss how this information can be used to determine the dominant pathway of 
assembly, and show that the dominant pathways have intermediates with a characteristic structure 
that may potentially be exploited in the framework of anti-viral drug design. In the final section we 
summarize our results and assess the implications for other families of viruses. 



2 The master equation approach 



The formalism presented in this section allows one to compute the probability distribution of the inequivalent
configurations (also called species or assembly intermediates) that appear during self-assembly
of the major capsid protein of a virus in thermodynamic equilibrium. Assume that the different
species are indexed from 1 to $N$, where species 1 corresponds to the fundamental building block of
the capsid, $N$ to the final capsid, and every other assembly intermediate, $i$, is formed by $n_i$ copies
of building block 1. As customary, we consider capsid assembly as a sequence of low order reactions,
and hence assume that the formation of the capsid occurs through the attachment or detachment of
single fundamental units to the partially-formed capsid. From a phenomenological point of view, the
equilibrium thermodynamics of the process is described through second-order association constants.
Denoting by $[a]$ and $[b]$ the concentrations of two species whose numbers of constitutive building
blocks are, respectively, $n_a$ and $n_b = n_a + 1$, their association constant is given by



$$K_{b,a} = \frac{[b]_{eq}}{[a]_{eq}\,[1]_{eq}} \qquad (1)$$



where $[1]$ denotes the concentration of the fundamental building block and the subscripts are used to
stress that the concentrations pertain to the equilibrium (stationary) state. This phenomenological
relationship can be related to the fundamental entropic and energetic aspects of the association process
through the following factorization (as in [25], [10]):

$$K_{b,a} = \frac{1}{c_0}\, S_1\, S_{ba}\, e^{-\frac{\Delta G(b,a)}{RT}}, \qquad (2)$$

where $S_1$ denotes the geometric degeneracy of the fundamental "incoming" subunit, $S_{ba} := O(a)/O(b)$
is the ratio of the orders $O(a)$ and $O(b)$ of the discrete rotational symmetry groups of the two
species, $\Delta G(b,a)$ is the free energy difference associated with the bonds formed by the incoming
building block, and $R$ and $T$ denote the gas constant ($R = 1.987$ cal K$^{-1}$ mol$^{-1}$) and the
temperature (chosen as room temperature, $T = 298$ K), respectively. The quantity $c_0$, having the dimension of a
concentration, can be put in unique correspondence with the total concentration of elementary blocks
present in the system, as will be shown below.

The hierarchy of association constants of the various pairs of intermediate species differing by one
fundamental building block can be used in recursive schemes for obtaining the equilibrium probabilities
of any species. In fact, by combining equations (1) and (2) and introducing the adimensional quantity
$[\tilde{1}]_{eq} := [1]_{eq}/c_0$ we obtain the fundamental relationship:

$$\frac{[b]_{eq}}{[a]_{eq}} = S_1\, S_{ba}\, [\tilde{1}]_{eq}\, e^{-\frac{\Delta G(b,a)}{RT}} \qquad (3)$$

which, used recursively, yields the formal expression of the equilibrium concentration of a generic
species, $[i]$, ($i \neq 1$),

$$[i]_{eq} = S_1^{\,n_i-1}\, \frac{O(1)}{O(i)}\, e^{-\frac{\Delta G(i,1)}{RT}}\, [1]_{eq}\, [\tilde{1}]_{eq}^{\,n_i-1}, \qquad (4)$$

where $n_i$ is the number of fundamental building blocks in species $i$ and $O(1)$ and $O(i)$ refer to the
orders of discrete rotational symmetry of subunit 1 and species $i$, respectively. Notice that equation
(4) depends implicitly on $c_0$ through $[\tilde{1}]_{eq}$. This gives the possibility of relating $c_0$ to the total
concentration of fundamental building blocks, $[c^*]$, through the relationship,



$$[c^*] = \sum_{i=1}^{N} n_i\, [i]. \qquad (5)$$
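As an illustration of how equations (4) and (5) can be used together, the following Python sketch solves the conservation law for $[1]_{eq}$ by bisection and then evaluates equation (4) for every species. It is only a sketch under stated assumptions: the function name, the three-species list with its values of $n_i$, $O(i)$ and $\Delta G(i,1)$, the choice $S_1 = 5$ and the value of $c_0$ are hypothetical and are not taken from the SV40 data, and the symmetry factor is taken in the form $O(1)/O(i)$.

```python
import math

R = 1.987e-3  # gas constant in kcal K^-1 mol^-1
T = 298.0     # temperature in K


def equilibrium_concentrations(species, s1, c0, c_total):
    """Return [i]_eq for all species from equations (4) and (5).

    species : list of (n_i, O_i, dG_i), with dG_i = Delta G(i,1) in kcal/mol;
              the first entry must be the building block (n = 1, dG = 0).
    s1, c0  : geometric degeneracy S_1 and reference concentration c_0.
    c_total : total concentration [c*] of building blocks (same units as c0).
    """
    o1 = species[0][1]

    def concentrations(x):
        # Equation (4) with x = [1]_eq, evaluated in log space.
        result = []
        for n, o, dg in species:
            log_c = (math.log(x) + (n - 1) * math.log(s1 * x / c0)
                     + math.log(o1 / o) - dg / (R * T))
            result.append(math.exp(min(log_c, 700.0)))  # cap avoids overflow
        return result

    def total(x):
        # Right-hand side of the conservation law (5).
        return sum(n * c for (n, _, _), c in zip(species, concentrations(x)))

    # total(x) increases with x and total(c_total) >= c_total, so bisect.
    lo, hi = 0.0, c_total
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if total(mid) > c_total:
            hi = mid
        else:
            lo = mid
    return concentrations(0.5 * (lo + hi))


# Hypothetical three-species example (concentrations in micromolar).
toy = [(1, 5, 0.0), (2, 1, -1.0), (3, 1, -3.0)]
conc = equilibrium_concentrations(toy, s1=5.0, c0=1.0, c_total=10.0)
```

Working in log space keeps the evaluation stable when the Boltzmann factors become very large or very small, which is the typical situation for species with many bonds.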

Note that equation (5) expresses the law of conservation of the total number of fundamental
building blocks present in the system (be they "free" or assembled in intermediate species) and
is therefore valid not only in equilibrium. The association constants introduced above can also be used beyond
the equilibrium framework, since they constitute the starting point for formulating phenomenological
kinetic equations apt to capture the time evolution of the system given the initial concentration of
the various species. Within a vanishingly small time interval the concentration of a given species $[i]$
(we assume $i \neq 1$) can change only due to the gain or loss of one fundamental building block:

$$\frac{d[i]}{dt} = \sum_{m} [m]\, W_{m,i} + \sum_{l} [l]\,[1]\, W_{l,i} - \sum_{m} [i]\,[1]\, W_{i,m} - \sum_{l} [i]\, W_{i,l} \qquad (6)$$

where we have omitted the explicit time dependence of the species concentrations. The indices $l$ and
$m$ in the sums refer to the species formed by one less, respectively one more, fundamental building
block than species $i$. Finally, $W_{i,j}$ denotes the time-independent rate at which transitions are made
from configuration $i$ to configuration $j$. The dynamics of the system is thus fully described by the
set of coupled equations (6) for each $i \neq 1$, supplemented with the conservation law in equation (5).
The $W$'s must be appropriately related to the association constants, $K$, to ensure that the correct
equilibrium conditions (1) are recovered at large times when the stationary regime is reached (i.e.
when $d[i]/dt = 0$ for all species $i$).
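The kinetic scheme can be made concrete with a small numerical experiment. The Python sketch below integrates the rate equations for a hypothetical linear chain of three species (building block, dimer, trimer) with a forward-Euler step, obtaining $[1]$ from the conservation law. The function name, the rate constants and the association constants are illustrative assumptions; in particular, choosing a common association rate `k_on` and dissociation rates `k_on / K` is only one of many choices that reproduce the equilibrium condition at long times.

```python
def simulate_chain(K, c_total, k_on=1.0, dt=1e-3, steps=50_000):
    """Forward-Euler integration of the rate equations for a chain 1 -> 2 -> ... -> N.

    K[j] is the association constant of the step (j+1)-mer + 1 <-> (j+2)-mer.
    Association rates are set to k_on and dissociation rates to k_on / K[j],
    so that the equilibrium ratios are recovered in the stationary state
    (an assumed choice). Returns the concentrations [1], ..., [N] at the end.
    """
    n_species = len(K) + 1
    c = [c_total] + [0.0] * (n_species - 1)   # only building blocks at t = 0
    for _ in range(steps):
        new = c[:]
        for i in range(1, n_species):          # rate equations hold for i != 1
            gain_from_smaller = k_on * c[i - 1] * c[0]
            loss_to_smaller = (k_on / K[i - 1]) * c[i]
            rate = gain_from_smaller - loss_to_smaller
            if i + 1 < n_species:
                gain_from_larger = (k_on / K[i]) * c[i + 1]
                loss_to_larger = k_on * c[i] * c[0]
                rate += gain_from_larger - loss_to_larger
            new[i] = c[i] + dt * rate
        # [1] follows from conservation of the total number of building blocks
        new[0] = c_total - sum((i + 1) * new[i] for i in range(1, n_species))
        c = new
    return c


c = simulate_chain(K=[2.0, 5.0], c_total=1.0)
```

At the end of the run the ratios `c[1] / (c[0] * c[0])` and `c[2] / (c[1] * c[0])` can be compared with the prescribed association constants to confirm that the stationary state reproduces the equilibrium relations.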

Among all possible initial conditions for the above-mentioned kinetic evolution, a particularly
appealing and interesting one is the case where the only species present is the
one associated with the fundamental building blocks. At this initial time ($t = 0$) the state of the
system would then be described by $[1]_{t=0} = [c^*]$ and $[j]_{t=0} = 0$ for all species $j > 1$. The lack
of precise experimental characterization of the equilibrium concentrations of the various species in
biologically relevant conditions has led previous theoretical studies to focus on particularly simple
equilibrium situations, namely the one in which the only dominant species are that of the fully-formed
capsid and that of the fundamental building blocks; both species are considered as equiprobable so
that $[N]_{eq} = [1]_{eq}$ while, for all other species $j$, $[j]_{eq} \approx 0$.

Under these assumptions, the concentration of the fundamental species, $[1]$, is expected
to take on a rather limited range of values. We build on this observation to simplify the description of
the assembly process of equation (6) through a set of effective first order reactions. The key ingredient
in our analysis is to modify the right hand side of equation (6) so as to neglect the time-dependence
of the concentration $[1]$ and absorb it in new effective time-independent transition rates, $W$. It is
convenient to recast the kinetics obtained by this simplification of equation (6) not in terms of the
concentration of the $i$th species but of the equivalent probability of occurrence, $P_i(t) = [i]/\sum_j [j]$.
The discrete time evolution of $P_i$ is therefore governed by the following master equation



$$P_i(t + \Delta t) = P_i(t) + \Delta t \left[ \sum_{j} P_j(t)\, W_{j,i} - \sum_{j} P_i(t)\, W_{i,j} \right] \qquad (7)$$

where $\Delta t$ is the time step of the discretized evolution (assumed to be sufficiently small to justify the
linearization of the continuous time evolution). As before, the only non-zero entries in the transition
matrix $W$ are those connecting species which differ by the addition/removal of one fundamental
building block. The matrix $W = (W_{i,j})$ has to satisfy a number of properties (see [5]):

• $\Delta t \sum_j W_{i,j} = 1$ (normalization condition),

• there exists a finite integer $l$ such that $[W^l]_{i,j} > 0 \;\; \forall\, i, j$ (ergodic condition).

It is easy to check that the first condition ensures that $\sum_j P_j(t)$ is constant at all times while the
second one requires that any two configurations must be connected by a finite number of transitions.
The above conditions are sufficient to ensure the onset of equilibrium for $t \to \infty$, irrespective of the
initial condition of the system. From the stationarity of the equilibrium distribution we obtain, from
equation (7), the generalised balance condition:



$$\sum_{j} \left[ P_j^{eq}\, W_{j,i} - P_i^{eq}\, W_{i,j} \right] = 0. \qquad (8)$$



The constraints entailed by the balance equation are not sufficient to identify the matrix W uniquely. 
We solve this ambiguity by adopting the commonly-employed Metropolis criterion within a detailed 
balance scheme (Hj- The detailed balance condition requires that each term in the sum of eqn. 
(8) is zero. The Metropolis criterion further specifies the precise form of the $W$ matrix elements.
Accordingly, for two different species $i$ and $j$, which differ by the addition/removal of one fundamental
building block (otherwise $W_{i,j} = 0$) one has

$$W_{i,j} = \begin{cases} 1 & \text{if } P_i^{eq} < P_j^{eq}, \\ P_j^{eq}/P_i^{eq} & \text{otherwise}. \end{cases} \qquad (9)$$

The diagonal elements are instead obtained from the normalization condition: 

$$W_{i,i} = \frac{1}{\Delta t} - \sum_{j \neq i} W_{i,j}. \qquad (10)$$

It is easy to check that with this choice of $W$ the equilibrium distribution is stationary under the
evolution of equation (7). In the particular context of capsid assembly, the ratio of probabilities in
equation (9) is straightforwardly obtained from equation (4) given the proportionality of $[i]$ and $P_i$.
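The construction of the transition matrix and its use in the master equation can be sketched in a few lines of Python. The example below builds a Metropolis matrix with the diagonal fixed by the normalization condition, and iterates the discrete master equation on a hypothetical four-node graph in which two intermediates of equal size connect the start and end nodes. The function names, the graph, the target probabilities and the time step are assumptions chosen purely for illustration.

```python
def metropolis_matrix(p_eq, edges, dt):
    """Metropolis transition matrix for species linked by `edges`.

    p_eq  : target equilibrium probabilities
    edges : pairs (i, j) of species differing by one building block
    dt    : time step; must be small enough that all diagonal entries are >= 0
    """
    n = len(p_eq)
    W = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        for a, b in ((i, j), (j, i)):
            # Moves towards a more probable species get rate 1,
            # moves towards a less probable one get the probability ratio.
            W[a][b] = 1.0 if p_eq[a] < p_eq[b] else p_eq[b] / p_eq[a]
    for i in range(n):
        # Diagonal entries follow from the normalization condition.
        W[i][i] = 1.0 / dt - sum(W[i][j] for j in range(n) if j != i)
    return W


def step(P, W, dt):
    """One update of the discrete master equation."""
    n = len(P)
    return [P[i] + dt * sum(P[j] * W[j][i] - P[i] * W[i][j] for j in range(n))
            for i in range(n)]


# Hypothetical example: node 0 -> {1, 2} -> 3, starting with only node 0 populated.
p_eq = [0.5, 0.1, 0.1, 0.3]
edges = [(0, 1), (0, 2), (1, 3), (2, 3)]
dt = 0.2
W = metropolis_matrix(p_eq, edges, dt)
P = [1.0, 0.0, 0.0, 0.0]
for _ in range(2000):
    P = step(P, W, dt)
```

After the iteration, `P` can be compared with `p_eq`; only the stationary distribution is fixed by the Metropolis choice, while the transient depends on the form of `W`.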

Some caveats must be borne in mind when interpreting the outcome of the master equation as
a kinetic process. While the equilibrium distribution is insensitive to the choice of the $W$'s, as long
as they satisfy equation (8), the kinetics is strongly affected by the form of the $W$ matrix. Our
choice follows the common practice of adopting the Metropolis criterion, but remains only one of
the equivalent possibilities in terms of correct asymptotic behaviour. Also, we stress again that
recasting equation (6) into the master equation (7) was possible only upon neglecting the time
dependence of $[1]$ in (6).

3 Application of the formalism to SV40 virus 

In this section we apply the master equation formalism to the assembly of SV40 virus. The capsid
of SV40 is composed of 72 pentameric building blocks that adopt two different types of local
configurations (with respect to their local bonding structure) in the capsid. These can be modelled via
the tiling approach shown in Fig. 1, adapted from [19]. In particular, dimer- and trimer-interactions,
that is interactions between two, respectively three, protein subunits, are represented geometrically
as rhombs, respectively kites. These rhombs and kites (see also Fig. 2) tessellate the surface of the
capsid, and encode the locations of the protein subunits and the inter-subunit bonds. In particular,
tiles have to be interpreted as follows: dots on the tiles correspond to angles of equal magnitude and
mark the locations of the protein subunits. The locations of the inter-subunit bonds (C-terminal arm
extensions in the case of SV40) correspond to the straight lines connecting these dots.

SV40 has an icosahedrally symmetric capsid, and therefore the tiling also has this symmetry.
From the tiling, one can see that the twelve pentamers located at the 5-fold vertices of the capsid
are surrounded by trimer-interactions, and the 60 other pentamers by a combination of dimer- and
trimer-interactions. There are hence two different types of local environments, which are shown in
Fig. 3.

Figure 1: The tiling representing the viral capsid of SV40.

Figure 2: Tiles corresponding to the tiling in Fig. 1.


Instead of using the tiles themselves, it is more convenient, and mathematically equivalent, to
work with the pentagons and hexagons shown superimposed on the tiles (Fig. 3, left), where the
edges of the pentagons and hexagons are labelled according to the association energies related to the
tiles they bisect (same figure, right). In particular, for SV40 there are 3 different types of bonds,
corresponding to the association energy $a$ for a single C-terminal arm in a kite tile, the association energy
$b$ for a quasi-dimer bond ("yellow-yellow" rhombs in Fig. 1, named after their location on a local
2-fold symmetry axis), and the association energy $c$ for a strict dimer bond ("blue-red" rhombs in the figure, named after
their location on a 2-fold symmetry axis). They have to be taken into account when computing the
free energies $\Delta G$ introduced in section 2.

In [10] the assembly of SV40 virus is considered based on these pentagonal and hexagonal building
blocks. Assembly is considered as a cascade of low order reactions by association of a single building
block at a time. The complete characterization of the assembly thermodynamics would require the
full classification of all possible species of correctly-connected pentamers and hexamers, that is all
combinatorially possible combinations of these building blocks, even those comprising more than the
72 blocks necessary to form the full capsid (which may correspond to malformed capsids or other
types of closed or open structures). Obviously, this exhaustive enumeration cannot be accomplished.
Consequently it is necessary to reduce the number of species to a manageable size by discarding
configurations that are supposed to play an unimportant role in the assembly process. To illustrate
the reduction procedure employed in this study it is convenient to regard the various species as the
nodes of a graph. The links in the graph connect species which differ by the attachment/removal of
a pentamer or hexamer. The assembly tree containing the species to which we restrict our attention
is constructed as follows. Without loss of generality the first node is constituted by the fundamental
pentamer. We then consider all the geometrically inequivalent species obtained via the addition of an
extra building unit. Of these species we retain only the one (or ones in case of degeneracy) having the
lowest free energy. Each of the retained species becomes a new node of the graph (and is linked to the
parent node). For each of these offspring nodes we carry out the search for minimum
free-energy descendants, as before. The process ultimately ends when the node corresponding to the
full capsid is reached.
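The reduction procedure described above amounts to a greedy graph search, which can be sketched as follows in Python. The function and its three callbacks are hypothetical: how inequivalent descendants are enumerated and how their free energies are evaluated from the bond energies is not specified here and is simply passed in by the caller. The tolerance `tol` used to detect degenerate minima is likewise an assumption.

```python
def build_minimal_tree(root, descendants, free_energy, is_complete, tol=1e-9):
    """Greedy construction of a minimal free-energy assembly tree.

    descendants(s) -> inequivalent species obtained from s by adding one
                      building block (hypothetical callback)
    free_energy(s) -> free energy of species s (hypothetical callback)
    is_complete(s) -> True if s is the full capsid
    Returns a dict mapping each retained species to its retained children.
    """
    tree = {}
    frontier = [root]
    while frontier:
        node = frontier.pop()
        if node in tree:
            continue                      # already expanded via another parent
        candidates = [] if is_complete(node) else list(descendants(node))
        if not candidates:
            tree[node] = []
            continue
        g_min = min(free_energy(c) for c in candidates)
        # Keep the minimum and any candidates degenerate with it.
        kept = [c for c in candidates if free_energy(c) <= g_min + tol]
        tree[node] = kept
        frontier.extend(kept)
    return tree


# Toy example with made-up species labels and energies.
children = {"1": ["2a", "2b"], "2a": ["3"], "2b": ["3"], "3": []}
energy = {"1": 0.0, "2a": -1.0, "2b": -0.5, "3": -3.0}
tree = build_minimal_tree("1", children.__getitem__, energy.__getitem__,
                          lambda s: s == "3")
```

In the toy example the higher-energy species `"2b"` is discarded, so the retained graph contains a single pathway; with degenerate minima several children would be kept and the graph would branch.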

Within this limited set of species, the possible assembly pathways are represented as walks on the 
tree connecting linked nodes. We stress again that, due to the selection criterion based on the free 
energy minimization, the assembly tree that we obtain represents only a subset of the combinatorially 
possible nodes. The resulting "minimal tree" is uniquely encoded by the energy parameters, a, b and 
c. Since one of these three quantities can be taken as the unit of energy, we have that the choice of 
the assembly tree depends on two adimensional parameters: a/c and b/c. In [10], the phase diagram
of the system corresponding to these parameters is depicted. It is shown that one can identify convex 
regions in this two-dimensional parameter space where parameters can be varied without affecting the 
assembly tree (but, obviously, the probability of occurrence of the various species will change from 
point to point in the same region). 

While no direct measurement of the association energies a, b and c is available, an estimate for the
assembly free energies is provided by the VIPER database. We discuss the assembly tree associated
with the point x = b/c ≈ 0.92 and y = a/c ≈ 0.47 in the phase space, as it corresponds to the ratios
of the association energies listed on the VIPER webpages for SV40.

Figure 3: The building blocks for SV40 capsid assembly.

In Fig. 4 we portray the portion of the assembly tree for SV40 limited to species with up to
16 building blocks. It contains 19 assembly intermediates, out of a total number of 505 assembly
intermediates that are encoded by the minimal free-energy assembly tree [10]. The complicated
structure of the assembly tree makes the use of relation (4) impractical for computing the relative
concentrations of the assembly intermediates. Therefore, only the concentrations of the dominant
assembly intermediates, i.e. those located on all paths in the assembly tree such as, for example, the
intermediates denoted as 1,01 to 5,01 and 8,01 to 12,01 in Fig. 4, have been computed prior to this
work (see ref. [10]). However, the concentrations of all assembly intermediates are needed in order
to obtain clues about the putative pathways of assembly, and hence about possible mechanisms for
anti-viral drug design.

Figure 4: The start of the assembly tree for SV40.

To compute the equilibrium probabilities of occupation of the various species, and to identify the
dominant kinetic pathway within the assembly tree of Fig. 4, we must go beyond the mere specification
of the adimensional parameters a/c and b/c and consider the absolute values of the energies a, b, c.
The absolute values of the nominal free energies provided by the VIPER website are more than an
order of magnitude larger than the typical interaction energies of biomolecules (usually of the order
of a few kcal/mol). The particularly high VIPER values may indeed reflect the shortcomings of the
rigid-unit approximations involved in the potential extraction scheme. We shall therefore obtain an
indication of the absolute scale for the SV40 free energies and of the concentration $c_0$ by following
some guidelines inspired by previous theoretical and experimental work.

The first quantitative experimental input pertains to the overall concentration of fundamental
units present in solution, $[c^*]$, which is typically of the order of 10 μM. Secondly, as anticipated, we
wish to describe the situation where the dominant species in equilibrium are $[1]$ and $[N]$ (see footnote 4). Assuming
that $[N]_{eq}$ and $[1]_{eq}$ are equiprobable one has:

$$1 = \frac{[N]_{eq}}{[1]_{eq}} = \frac{1}{12}\left(S_1\,\frac{[1]_{eq}}{c_0}\right)^{71} e^{-\frac{k a}{RT}}, \qquad (11)$$

where $k := 180 + 60\,b/a + 30\,c/a \approx 360.8571$ (see footnote 5).

The above requirement provides a relationship between the free energy scale, $a$, and $c_0$ (entering
implicitly through $[1]_{eq}/c_0$). The second condition tying $c_0$ and $a$ is obtained by requiring that the
concentration of species $[2]$ is significantly smaller than $[1]$. This requirement implies, through the
chain relations of equation (3), that the dominant species are indeed $[1]$ and $[N]$, so that

$$[c^*] = \sum_{k=1}^{505} n_k\, \frac{O(1)}{O(k)}\, S_1^{\,n_k-1}\, e^{-\frac{\Delta G(k,1)}{RT}} \left(\frac{[1]_{eq}}{c_0}\right)^{n_k-1} [1]_{eq} \approx [1]_{eq} + 72\,[N]_{eq} = 73\,[1]_{eq}. \qquad (12)$$

We discuss here the case where $[2] = [1]/10$, which is satisfied when $a$ takes on the realistic
value $a \approx -0.7$ kcal/mol. All our conclusions about the dominant pathway in the assembly tree
are unchanged if much larger values of $a$ (in modulus) are used, although these might result in
unrealistically low concentrations of intermediates.

Therefore the assembly tree for SV40 has been computed for the values $a = -0.7$ kcal mol$^{-1}$,
$b = -1.37$ kcal mol$^{-1}$ and $c = -1.49$ kcal mol$^{-1}$. The shortest pathways in the tree that connect $[1]$
and $[N]$ (i.e. those without loops) contain precisely 72 species, one for each possible number of building
blocks, see Fig. 4.
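The orders of magnitude involved can be checked with a short calculation. The Python sketch below evaluates $k$ from the quoted energies, the concentration $[1]_{eq}$ implied by $[c^*] = 73\,[1]_{eq}$ with $[c^*] = 10$ μM, and the dimensionless product $S_1 [1]_{eq}/c_0$ that follows from the equiprobability condition. The latter step assumes that this condition has the form $1 = \frac{1}{12}(S_1 [1]_{eq}/c_0)^{71} e^{-ka/RT}$, i.e. a symmetry factor $O(1)/O(N) = 5/60$; this form and the variable names are assumptions of the sketch. Note also that the rounded energies give $k \approx 361.29$ rather than the quoted 360.8571, presumably because the quoted value is based on unrounded energies.

```python
import math

R = 1.987e-3                    # gas constant in kcal K^-1 mol^-1
T = 298.0                       # temperature in K
a, b, c = -0.7, -1.37, -1.49    # association energies in kcal/mol (rounded values)

# Total bond energy of the capsid in units of a.
k = 180 + 60 * b / a + 30 * c / a

# Conservation law with only [1] and [N] populated: [c*] = 73 [1]_eq.
c_total = 10.0                  # [c*] in micromolar
one_eq = c_total / 73.0

# Assumed equiprobability condition 1 = (1/12) X^71 exp(-k a / RT),
# solved for X = S_1 [1]_eq / c_0 in log space to avoid underflow.
log_product = (math.log(12.0) + k * a / (R * T)) / 71.0
product = math.exp(log_product)

print(f"k = {k:.4f}, [1]_eq = {one_eq:.4f} uM, S_1 [1]_eq / c_0 = {product:.3e}")
```

Because the Boltzmann factor of the full capsid is enormous, the product $S_1 [1]_{eq}/c_0$ must be well below one for the capsid and the free building block to be equally populated.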

We have computed the concentrations of the assembly intermediates in thermodynamic equilibrium
as shown in Fig. 5.

In particular, one observes that concentrations are highest at the beginning and at the end of the
assembly pathways, and are strictly and rapidly decreasing (at the beginning) or increasing (at the
end). For the intermediates at the start of the assembly tree shown in Fig. 4 one observes furthermore
the following scenario: in the cases where more than one intermediate with the same number of building
blocks exists (such as, for example, 6,01 and 6,02, or 7,01 and 7,02) their concentrations are either
identical as a consequence of degeneracy (such as for 6,01 and 6,02 and all other nodes springing
out from a parent node), or vary strongly (7,02 having a larger concentration than 7,01). In the
latter case, we indicate the intermediate with the larger concentration by a double circle in Fig.
4. Our computations show furthermore that for comparable values of probabilities towards the end
of the pathways, different intermediates with the same number of building blocks (and hence all
assembly pathways) are indistinguishable. Therefore, the start of the assembly pathways singles out
the dominant pathway(s). Since dead-ends and traps do not occur in the assembly tree, the pathways
containing the two largely dominant configurations 7,02 and 14,02 in Fig. 4 must be the dominant
pathways during assembly.

The geometries of the intermediates 7,02 and 14,02 are shown in Fig. 6. One observes that in
each case, the dominant configuration is obtained from the previous intermediate via the formation of
bonds with association energy 2a, a and b. The fact that the formation of this constellation of bonds is
important is corroborated further by the following: we have increased the association energies of the
bonds a and b individually, and have compared the ratio of final capsids to pentamers in equilibrium.
In both cases the yield of final capsids has increased, with a stronger increase in response to an
increase of the association energy b.

4 For pentamers in solution we do not distinguish between the two different types of building blocks in Fig. 3, as
their C-terminal arms are dangling freely and the building blocks are a way of encoding local bonding structures when
bound in the capsid.

5 Note that this equation relates the association energy a with the equilibrium concentration of pentamers, $[1]_{eq}$,
and hence changes in $[1]_{eq}$ may be engineered by changing a. The latter can be achieved for example via alterations in
the polypeptide chain of the proteins (see e.g. IH).

Figure 5: SV40 capsid assembly: scatter plot for the concentrations of assembly intermediates, $[j]$, as
a function of the number of building blocks, $n_j$. The concentrations were normalised with respect to
that of the fundamental unit, $[1]$. Notice that more than one species may exist for a given $n_j$. The plot
refers to the situation where $[c^*] = 10$ μM and $a = -0.7$ kcal/mol.

These considerations suggest that SV40 capsid assembly is driven more by the details of the
association free energies than by differences in the geometrical entropy associated with the rotational
symmetries of the various species. This can be conveniently checked by setting all association free
energies, a, b and c, to zero and computing the contribution of the factors $S_1$ and $S_{ba}$ to the
concentrations of the various species. One observes that all assembly intermediates (on different
assembly pathways) with an equal number of building blocks have the same probability, with the exception
of assembly intermediates that have a discrete rotational symmetry. However, since these symmetry
corrections appear only at later stages in the pathways (first occurrence at iteration step 30) where
concentrations are low when energy effects are taken into account, they do not need to be considered when
distinguishing the dominant pathways. The dominant pathways are, therefore, strongly dependent on
the association free energies.
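The entropy-only check described above is easy to reproduce in a toy setting. With all association energies set to zero, the equilibrium concentration of a species relative to the building block depends only on its size and on the order of its rotational symmetry group. The Python sketch below assumes the symmetry factor enters as $O(1)/O(i)$ and uses hypothetical values for the species size, the symmetry orders and the product $S_1 [1]_{eq}/c_0$; none of these numbers are taken from the SV40 computation.

```python
def entropic_weight(n, order, s1_x, o1=5):
    """[i]_eq / [1]_eq with all association energies set to zero.

    n     : number of building blocks n_i of the species
    order : order O(i) of the rotational symmetry group of the species
    s1_x  : the dimensionless product S_1 [1]_eq / c_0
    o1    : order O(1) of the building block (5 assumed for a pentamer)
    """
    return (o1 / order) * s1_x ** (n - 1)


# Hypothetical intermediates with the same number of building blocks.
x = 0.5
asymmetric_a = entropic_weight(30, 1, x)
asymmetric_b = entropic_weight(30, 1, x)
two_fold = entropic_weight(30, 2, x)
```

Two asymmetric intermediates of equal size receive identical weights, while an intermediate with a 2-fold rotational symmetry is suppressed by a factor of two relative to them, which is the only way the pathways can be told apart in the absence of energetic contributions.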

In [10] it has been shown that the phase space formed by the ratios of the association constants can
be partitioned into areas in which the qualitative behaviour of assembly, as encoded in the assembly
tree, is indistinguishable. In order to explore whether similar results also hold for these other areas in
phase space (and hence for different assembly trees), we have calculated the concentrations of the
assembly intermediates for a representative of a different area in the partition (area 1 in Fig. 5 of [10]).
It corresponds to the point x = 0.75 and y = 0.45 in phase space. The complete assembly tree consists
of 281 species, and the start of the assembly tree is shown in Fig. 7. The assembly intermediates with
the larger concentrations are again marked by double circles. As in the case of SV40 assembly, the
occurrence of assembly intermediates with concentrations larger than that of the other intermediates
with the same number of building blocks is related to the formation of bonds with association energies
2a, a and b. This phenomenon hence seems pertinent to the selection of the dominant pathways, and
may therefore provide insight into the aspects of viral assembly that may become targets of drugs.

Figure 6: The geometries of the intermediates 7,02 and 14,02.

4 Discussion 

We have demonstrated that a combination of the master equation and the tiling approach allows
one to determine the putative pathways for SV40 capsid assembly and sheds light on the mechanisms
that drive capsid assembly. In particular, we have demonstrated that the more important assembly
pathways are those where a particular constellation of bonds is formed at an early stage (see Fig. 6).
Hence, this constellation of bonds could be a possible target for antiviral drug design; for example,
it suggests searching for a drug that binds to the sites related to these bonds, hence preventing the
formation of this particular constellation of bonds.

Moreover, our analysis has shown that SV40 capsid assembly is strongly driven by the details of
the association free energies of the tiles and is only slightly affected by the entropic aspects associated
with the different rotational symmetries of the various species. While this may be similar for other DNA
viruses, it is presumably not the case for RNA viruses due to the interactions between RNA and the
protein building blocks of the capsids. This fact justifies neglecting other combinatorially possible
intermediates, which however may be important for other viruses as demonstrated in 0. 